Results 1 - 20 of 442
1.
bioRxiv ; 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38562829

ABSTRACT

The secreted mucins MUC5AC and MUC5B play critical defensive roles in airway pathogen entrapment and mucociliary clearance by encoding large glycoproteins with variable number tandem repeats (VNTRs). These polymorphic and degenerate protein coding VNTRs make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes. We find that human MUC5B is largely invariant (5761-5762aa); however, seven haplotypes have expanded VNTRs (6291-7019aa). In contrast, 30 allelic variants of MUC5AC encode 16 distinct proteins (5249-6325aa) with cysteine-rich domain and VNTR copy number variation. We grouped MUC5AC alleles into three phylogenetic clades: H1 (46%, ~5654aa), H2 (33%, ~5742aa), and H3 (7%, ~6325aa). The two most common human MUC5AC variants are smaller than NHP gene models, suggesting a reduction in protein length during recent human evolution. Linkage disequilibrium (LD) and Tajima's D analyses reveal that East Asians carry exceptionally large MUC5AC LD blocks with an excess of rare variation (p<0.05). To validate this result, we used Locityper for genotyping MUC5AC haplogroups in 2,600 unrelated samples from the 1000 Genomes Project. We observed signatures of positive selection in H1 and H2 among East Asians and a depletion of the likely ancestral haplogroup (H3). In Africans and Europeans, H3 alleles show an excess of common variation and deviate from Hardy-Weinberg equilibrium, consistent with heterozygote advantage and balancing selection. This study provides a generalizable strategy to characterize complex protein coding VNTRs for improved disease associations.

2.
bioRxiv ; 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38645259

ABSTRACT

The crab-eating macaques (Macaca fascicularis) and rhesus macaques (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.

3.
J Neurodev Disord ; 16(1): 15, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622540

ABSTRACT

BACKGROUND: Neurodevelopmental conditions such as intellectual disability (ID) and autism spectrum disorder (ASD) can stem from a broad array of inherited and de novo genetic differences, with marked physiological and behavioral impacts. We currently know little about the psychiatric phenotypes of rare genetic variants associated with ASD, despite heightened risk of psychiatric concerns in ASD more broadly. Understanding behavioral features of these variants can identify shared versus specific phenotypes across gene groups, facilitate mechanistic models, and provide prognostic insights to inform clinical practice. In this paper, we evaluate behavioral features within three gene groups associated with ID and ASD - ADNP, CHD8, and DYRK1A - with two aims: (1) characterize phenotypes across behavioral domains of anxiety, depression, ADHD, and challenging behavior; and (2) understand whether age and early developmental milestones are associated with later mental health outcomes. METHODS: Phenotypic data were obtained for youth with disruptive variants in ADNP, CHD8, or DYRK1A (N = 65, mean age = 8.7 years, 40% female) within a long-running, genetics-first study. Standardized caregiver-report measures of mental health features (anxiety, depression, attention-deficit/hyperactivity, oppositional behavior) and developmental history were extracted and analyzed for effects of gene group, age, and early developmental milestones on mental health features. RESULTS: Patterns of mental health features varied by group, with anxiety most prominent for CHD8, oppositional features overrepresented among ADNP, and attentional and depressive features most prominent for DYRK1A. For the full sample, age was positively associated with anxiety features, such that elevations in anxiety relative to same-age and same-sex peers may worsen with increasing age. Predictive utility of early developmental milestones was limited, with evidence of early language delays predicting greater difficulties across behavioral domains only for the CHD8 group. CONCLUSIONS: Despite shared associations with autism and intellectual disability, disruptive variants in ADNP, CHD8, and DYRK1A may yield variable psychiatric phenotypes among children and adolescents. With replication in larger samples over time, efforts such as these may contribute to improved clinical care for affected children and adolescents, allow for earlier identification of emerging mental health difficulties, and promote early intervention to alleviate concerns and improve quality of life.


Subjects
Autism Spectrum Disorder; Intellectual Disability; Neurodevelopmental Disorders; Adolescent; Child; Female; Humans; Male; Autism Spectrum Disorder/complications; DNA-Binding Proteins/genetics; Homeodomain Proteins/genetics; Intellectual Disability/genetics; Intellectual Disability/complications; Mental Health; Nerve Tissue Proteins/genetics; Neurodevelopmental Disorders/genetics; Neurodevelopmental Disorders/complications; Quality of Life; Transcription Factors/genetics
4.
Nature ; 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38570684

ABSTRACT

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

5.
bioRxiv ; 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38654825

ABSTRACT

TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on chromosome 17. We find that most humans vary along two TBC1D3 clusters where human haplotypes are highly variable in copy number, differing by as many as 20 copies, and structure (structural heterozygosity 90%). We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.

6.
bioRxiv ; 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38529499

ABSTRACT

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.

7.
Cell ; 187(6): 1547-1562.e13, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38428424

ABSTRACT

We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.


Subjects
Genome; Primates; Animals; Humans; Base Sequence; Primates/classification; Primates/genetics; Biological Evolution; Sequence Analysis, DNA; Genomic Structural Variation
8.
medRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.

9.
bioRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38464314

ABSTRACT

Down syndrome is the most common form of human intellectual disability caused by precocious segregation and nondisjunction of chromosome 21. Differences in centromere structure have been hypothesized to play a potential role in this process in addition to the well-established risk of advancing maternal age. Using long-read sequencing, we completely sequenced and assembled the centromeres from a parent-child trio where Trisomy 21 arose in the child as a result of a meiosis I error. The proband carries three distinct chromosome 21 centromere haplotypes that vary by 11-fold in length--both the largest (H1) and smallest (H2) originating from the mother. The longest H1 allele harbors a less clearly defined centromere dip region (CDR) as defined by CpG methylation and a significantly reduced signal by CENP-A chromatin immunoprecipitation sequencing when compared to H2 or paternal H3 centromeres. These epigenetic signatures suggest less competent kinetochore attachment for the maternally transmitted H1. Analysis of H1 in the mother indicates that the reduced CENP-A ChIP-seq signal, but not the CDR profile, pre-existed the meiotic nondisjunction event. A comparison of the three proband centromeres to a population sampling of 35 completely sequenced chromosome 21 centromeres shows that H2 is the smallest centromere sequenced to date and all three haplotypes (H1-H3) share a common origin of ~15 thousand years ago. These results suggest that recent asymmetry in size and epigenetic differences of chromosome 21 centromeres may contribute to nondisjunction risk.

10.
Genetics ; 226(4), 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38298127

ABSTRACT

Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than polymerase slippage in replicating progenitor cells. These results echo the recent finding that DNA damage in oocytes is a significant source of de novo single nucleotide variants and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to known hotspots of oocyte mutagenesis, nor are postzygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on de novo mutation (DNM) rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at G/C-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and contradict prior attribution of replication slippage as the primary mechanism of STR mutagenesis.


Subjects
Microsatellite Repeats; Mutation Rate; Humans; Female; Child; Mutation; Parents; Meiosis; Nucleotides
11.
Am J Med Genet A ; : e63514, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38329159

ABSTRACT

Genetics has become a critical component of medicine over the past five to six decades. Alongside genetics, a relatively new discipline, dysmorphology, has also begun to play an important role in providing critically important diagnoses to individuals and families. Both have become indispensable to unraveling rare diseases. Almost every medical specialty relies on individuals experienced in these specialties to provide diagnoses for patients who present themselves to other doctors. Additionally, both specialties have become reliant on molecular geneticists to identify genes associated with human disorders. Many of the medical geneticists, dysmorphologists, and molecular geneticists traveled a circuitous route before arriving at the position they occupied. The purpose of collecting the memoirs contained in this article was to convey to the reader that many of the individuals who contributed to the advancement of genetics and dysmorphology since the late 1960s/early 1970s traveled along a journey based on many chances taken, replying to the necessities they faced along the way before finding full enjoyment in the practice of medical and human genetics or dysmorphology. Additionally, and of equal importance, all exhibited an ability to evolve with their field of expertise as human genetics became human genomics with the development of novel technologies.

12.
Cell ; 187(5): 1024-1037, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38290514

ABSTRACT

This perspective focuses on advances in genome technology over the last 25 years and their impact on germline variant discovery within the field of human genetics. The field has witnessed tremendous technological advances from microarrays to short-read sequencing and now long-read sequencing. Each technology has provided genome-wide access to different classes of human genetic variation. We are now on the verge of comprehensive variant detection of all forms of variation for the first time with a single assay. We predict that this transition will further transform our understanding of human health and biology and, more importantly, provide novel insights into the dynamic mutational processes shaping our genomes.


Subjects
Genomic Structural Variation; Genomics; Humans; Genomics/methods; Germ-Line Mutation; Mutation; Technology
13.
Nat Commun ; 14(1): 8111, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062027

ABSTRACT

Topological associating domains (TADs) are self-interacting genomic units crucial for shaping gene regulation patterns. Despite their importance, the extent of their evolutionary conservation and its functional implications remain largely unknown. In this study, we generate Hi-C and ChIP-seq data and compare TAD organization across four primate and four rodent species and characterize the genetic and epigenetic properties of TAD boundaries in correspondence to their evolutionary conservation. We find 14% of all human TAD boundaries to be shared among all eight species (ultraconserved), while 15% are human-specific. Ultraconserved TAD boundaries have stronger insulation strength, CTCF binding, and enrichment of older retrotransposons compared to species-specific boundaries. CRISPR-Cas9 knockouts of an ultraconserved boundary in a mouse model lead to tissue-specific gene expression changes and morphological phenotypes. Deletion of a human-specific boundary near the autism-related AUTS2 gene results in the upregulation of this gene in neurons. Overall, our study provides pertinent TAD boundary evolutionary conservation annotations and showcases the functional importance of TAD evolution.


Subjects
Genome; Genomics; Animals; Mice; Humans; Gene Expression Regulation; Epigenomics; Chromatin Immunoprecipitation Sequencing; Chromatin; Mammals/genetics
14.
Genes (Basel) ; 14(12), 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38137007

ABSTRACT

The common marmoset (Callithrix jacchus) is one of the most widely used nonhuman primate models of human disease. Owing to limitations in sequencing technology, early genome assemblies of this species using short-read sequencing suffered from gaps. In addition, the genetic diversity of the species has not yet been adequately explored. Using long-read genome sequencing and expert annotation, we generated a high-quality genome resource creating a 2.898 Gb marmoset genome in which most of the euchromatin portion is assembled contiguously (contig N50 = 25.23 Mbp, scaffold N50 = 98.2 Mbp). We then performed whole genome sequencing on 84 marmosets sampling the genetic diversity from several marmoset research centers. We identified a total of 19.1 million single nucleotide variants (SNVs), of which 11.9 million can be reliably mapped to orthologous locations in the human genome. We also observed 2.8 million small insertion/deletion variants. This dataset includes an average of 5.4 million SNVs per marmoset individual and a total of 74,088 missense variants in protein-coding genes. Of the 4956 variants orthologous to human ClinVar SNVs (present in the same annotated gene and with the same functional consequence in marmoset and human), 27 have a clinical significance of pathogenic and/or likely pathogenic. This important marmoset genomic resource will help guide genetic analyses of natural variation, the discovery of spontaneous functional variation relevant to human disease models, and the development of genetically engineered marmoset disease models.


Subjects
Callithrix; Genomics; Animals; Humans; Callithrix/genetics; Chromosome Mapping; Genome, Human
15.
Sci Adv ; 9(44): eadh9543, 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37910626

ABSTRACT

The genetic mechanisms underlying the expansion in size and complexity of the human brain remain poorly understood. Long interspersed nuclear element-1 (L1) retrotransposons are a source of divergent genetic information in hominoid genomes, but their importance in physiological functions and their contribution to human brain evolution are largely unknown. Using multiomics profiling, we here demonstrate that L1 promoters are dynamically active in the developing and the adult human brain. L1s generate hundreds of developmentally regulated and cell type-specific transcripts, many that are co-opted as chimeric transcripts or regulatory RNAs. One L1-derived long noncoding RNA, LINC01876, is a human-specific transcript expressed exclusively during brain development. CRISPR interference silencing of LINC01876 results in reduced size of cerebral organoids and premature differentiation of neural progenitors, implicating L1s in human-specific developmental processes. In summary, our results demonstrate that L1-derived transcripts provide a previously undescribed layer of primate- and human-specific transcriptome complexity that contributes to the functional diversification of the human brain.


Subjects
Retroelements; Transcriptome; Animals; Humans; Retroelements/genetics; Long Interspersed Nucleotide Elements/genetics; Neurons; Primates/genetics
16.
Int J Mol Sci ; 24(21), 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37958807

ABSTRACT

The impact of segmental duplications on human evolution and disease is only just starting to unfold, thanks to advancements in sequencing technologies that allow for their discovery and precise genotyping. The 15q11-q13 locus is a hotspot of recurrent copy number variation associated with Prader-Willi/Angelman syndromes, developmental delay, autism, and epilepsy and is mediated by complex segmental duplications, many of which arose recently during evolution. To gain insight into the instability of this region, we characterized its architecture in human and nonhuman primates, reconstructing the evolutionary history of five different inversions that rearranged the region in different species primarily by accumulation of segmental duplications. Comparative analysis of human and nonhuman primate duplication structures suggests a human-specific gain of directly oriented duplications in the regions flanking the GOLGA cores and HERC segmental duplications, representing potential genomic drivers for the human-specific expansions. The increasing complexity of segmental duplication organization over the course of evolution underlies its association with human susceptibility to recurrent disease-associated rearrangements.


Subjects
Autistic Disorder; Prader-Willi Syndrome; Animals; Humans; DNA Copy Number Variations/genetics; Primates/genetics; Prader-Willi Syndrome/genetics; Segmental Duplications, Genomic/genetics; Autistic Disorder/genetics; Chromosomes, Human, Pair 15/genetics; Gene Duplication
17.
Sci Adv ; 9(47): eadj1261, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-37992162

ABSTRACT

The biological role of the repetitive DNA sequences in the human genome remains an outstanding question. Recent long-read human genome assemblies have allowed us to identify a function for one of these repetitive regions. We have uncovered a tandem array of conserved primate-specific retrogenes encoding the protein Elongin A3 (ELOA3), a homolog of the RNA polymerase II (RNAPII) elongation factor Elongin A (ELOA). Our genomic analysis shows that the ELOA3 gene cluster is conserved among primates and the number of ELOA3 gene repeats is variable in the human population and across primate species. Moreover, the gene cluster has undergone concerted evolution and homogenization within primates. Our biochemical studies show that ELOA3 functions as a promoter-associated RNAPII pause-release elongation factor with distinct biochemical and functional features from its ancestral homolog, ELOA. We propose that the ELOA3 gene cluster has evolved to fulfil a transcriptional regulatory function unique to the primate lineage that can be targeted to regulate cellular hyperproliferation.


Subjects
Peptide Elongation Factors; RNA Polymerase II; Animals; Humans; RNA Polymerase II/genetics; RNA Polymerase II/metabolism; Peptide Elongation Factors/genetics; Primates/genetics; Elongin/genetics; Multigene Family; Tandem Repeat Sequences/genetics
18.
Am J Hum Genet ; 110(11): 1832-1840, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37922882

ABSTRACT

Advances in long-read sequencing and assembly now mean that individual labs can generate phased genomes that are more accurate and more contiguous than the original human reference genome. With declining costs and increasing democratization of technology, we suggest that complete genome assemblies, where both parental haplotypes are phased telomere to telomere, will become standard in human genetics. Soon, even in clinical settings where rigorous sample-handling standards must be met, affected individuals could have reference-grade genomes fully sequenced and assembled in just a few hours given advances in technology, computational processing, and annotation. Complete genetic variant discovery will transform how we map, catalog, and associate variation with human disease and fundamentally change our understanding of the genetic diversity of all humans.


Subjects
Genomics; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA; Genome, Human/genetics; Telomere/genetics
19.
Emerg Top Life Sci ; 7(3): 361-381, 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-37905568

ABSTRACT

Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.


Subjects
DNA; Tandem Repeat Sequences; Humans; Tandem Repeat Sequences/genetics; Epigenesis, Genetic
20.
Nature ; 621(7978): 355-364, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.


Subjects
Chromosomes, Human, Y; Evolution, Molecular; Humans; Male; Chromosomes, Human, Y/genetics; Genome, Human/genetics; Genomics; Mutation Rate; Phenotype; Euchromatin/genetics; Pseudogenes; Genetic Variation/genetics; Chromosomes, Human, X/genetics; Pseudoautosomal Regions/genetics